Linguistic Classification Using Instance-Based Learning
Abstract
Traditionally, linguists have organized the languages of the world into language families, such as Indo-European, Dravidian and Sino-Tibetan. Within the Indo-European family, they further organize languages into sub-families such as Germanic, Celtic and Indo-Iranian. They do this by looking at similar-sounding words across languages and at the commonality of rules of word formation and sentence construction. In this work, we make use of computational approaches that are more scalable. More importantly, we contest the tree-based structure that language family models follow, which we feel is rather constraining and comes in the way of the natural discovery of relationships between any two languages. For example, the affinity Sanskrit has with Irish, Iranian or English is better illustrated using a network model. Similarly, the inter-relationships of Indian languages go beyond the confines of the Indo-Aryan divide. To enable the discovery of relationships between languages, in this paper we make use of instance-based learning techniques to assign language labels to words. Our approach comprises building a corpus of words and then applying clustering techniques to construct a training set. Following this, the words are vocalized and classified by making use of a custom linguistic distance metric. We considered seven Indian languages, namely Kannada, Marathi, Punjabi, Hindi, Tamil, Telugu and Sanskrit. We believe our work has the potential to usher in a new era in linguistics in India.
Keywords: Linguistics; Aryan invasion theory; Out of India theory; Soundex score; Instance-based learning; KNN; DBSCAN; Clustering; Clustering coefficient
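As a rough illustration of the classification step described in the abstract (assigning a language label to a word from its nearest labelled neighbours), the following is a minimal sketch only. The abstract does not specify the custom linguistic distance metric, so plain Levenshtein edit distance is swapped in as a stand-in. The function names `edit_distance` and `knn_label`, the value of `k`, and the placeholder training strings and labels are all assumptions made for this example and are not taken from the paper.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance; a stand-in for the paper's unspecified metric."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def knn_label(query: str, training: list[tuple[str, str]], k: int = 3) -> str:
    """Return the majority label among the k training words nearest to query."""
    ranked = sorted(training, key=lambda item: edit_distance(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Placeholder strings and labels (hypothetical, not data from the paper).
TRAINING = [
    ("aaa", "L1"), ("aab", "L1"), ("aba", "L1"),
    ("zzz", "L2"), ("zzy", "L2"), ("zyz", "L2"),
]

if __name__ == "__main__":
    print(knn_label("aac", TRAINING))
    print(knn_label("zzx", TRAINING))
```

Because the method is instance-based, there is no training phase: all labelled words are kept, and the distance computation happens only when a query word arrives. Replacing `edit_distance` with a phonetic measure (for example, one built on a Soundex-style encoding, which the keywords mention) would change only the `key` function passed to `sorted`.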
Similar resources
Multiple Instance Learning-Based Birdsong Classification Using Unsupervised Recording Segmentation
Traditional techniques for monitoring wildlife populations are temporally and spatially limited. Alternatively, in order to quickly and accurately extract information about the current state of the environment, tools for processing and recognition of acoustic signals can be used. In the past, a number of research studies on automatic classification of species through their vocalizations have be...
Weighted Instance-Based Learning Using Representative Intervals
Instance-based learning algorithms are widely used due to their capacity to approximate complex target functions; however, the performance of this kind of algorithms degrades significantly in the presence of irrelevant features. This paper introduces a new noise tolerant instance-based learning algorithm, called WIB-K, that uses one or more weights, per feature per class, to classify integer-va...
Supervised Learning Using Instance-based Patterns
This paper introduces a new classification algorithm of the instance-based learning type. Training records are converted into patterns associated with a known class label, and stored permanently in a trie-like tree structure along with other helpful information. Classifying new records is done by selecting from the trie the two best patterns as solution hypotheses. Best pattern selection is done u...
Learning Instance Specific Distance for Multi-Instance Classification
Multi-Instance Learning (MIL) deals with problems where each training example is a bag, and each bag contains a set of instances. Multi-instance representation is useful in many real world applications, because it is able to capture more structural information than traditional flat single-instance representation. However, it also brings new challenges. Specifically, the distance between data ob...
Multiresolution Instance-Based Learning
Instance-based learning methods explicitly remember all the data that they receive. They usually have no training phase, and only at prediction time do they perform computation. Then they take a query, search the database for similar datapoints, and build an on-line local model (such as a local average or local regression) with which to predict an output value. In this paper we review the advantage...
Journal
Journal title: Lecture Notes on Data Engineering and Communications Technologies
Year: 2022
ISSN: 2367-4520, 2367-4512
DOI: https://doi.org/10.1007/978-981-16-9113-3_63